Principal-component-analysis eigenvalue spectra from data with symmetry-breaking structure.

Authors

  • D C Hoyle
  • M Rattray
Abstract

Principal component analysis (PCA) is a ubiquitous method of multivariate statistics that focuses on the eigenvalues $\lambda$ and eigenvectors of the sample covariance matrix of a data set. We consider $p$ data vectors $\xi$, each $N$-dimensional, drawn from a distribution with covariance matrix $C$. We use the replica method to evaluate the expected eigenvalue distribution $\rho(\lambda)$ as $N \to \infty$ with $p = \alpha N$ for some fixed $\alpha$. In contrast to existing studies, we consider the case where $C$ contains a number of symmetry-breaking directions, so that the sample data set contains some definite structure. Explicitly, we set $C = \sigma^2 I + \sigma^2 \sum_{m=1}^{S} A_m B_m B_m^T$, with $A_m > 0$ for all $m$. We find that the bulk of the eigenvalues is distributed as in the case where the elements of $\xi$ are independent and identically distributed. With increasing $\alpha$ a series of phase transitions is observed at $\alpha = A_m^{-2}$, $m = 1, 2, \ldots, S$; at each transition a single delta function, $\delta(\lambda - \lambda_u(A_m))$, separates from the upper edge of the bulk distribution, where $\lambda_u(A) = \sigma^2 [1 + A][1 + (\alpha A)^{-1}]$. We confirm the results of the replica analysis by studying the Stieltjes transform of $\rho(\lambda)$. This suggests that the results obtained from the replica analysis are universal, irrespective of the distribution from which $\xi$ is drawn, provided the fourth moment of each element of $\xi$ exists.
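
The separation of an eigenvalue from the bulk can be checked numerically. The following is a minimal sketch, not the authors' method: it draws Gaussian data (an assumption; the abstract only requires a finite fourth moment), takes the directions B_m to be the first S coordinate axes, assumes zero-mean data so no centering is applied, and compares the largest sample eigenvalue with the abstract's λ_u(A). The bulk upper edge used here, σ²(1 + α^(-1/2))², is obtained by evaluating λ_u at the critical strength A = α^(-1/2) and is the standard Marchenko-Pastur edge. All function names are hypothetical.

```python
import numpy as np


def lambda_u(A, alpha, sigma2=1.0):
    """Outlier position from the abstract: sigma^2 [1 + A][1 + (alpha A)^-1]."""
    return sigma2 * (1.0 + A) * (1.0 + 1.0 / (alpha * A))


def bulk_upper_edge(alpha, sigma2=1.0):
    """lambda_u evaluated at the critical strength A = alpha^(-1/2)."""
    return sigma2 * (1.0 + 1.0 / np.sqrt(alpha)) ** 2


def sample_spectrum(N, alpha, strengths, sigma2=1.0, seed=0):
    """Eigenvalues (ascending) of the sample covariance of p = alpha*N vectors.

    Assumption: B_m is the m-th coordinate axis, so the population covariance
    C = sigma2 * (I + sum_m A_m B_m B_m^T) is diagonal.
    """
    rng = np.random.default_rng(seed)
    p = int(alpha * N)
    variances = np.full(N, sigma2, dtype=float)
    variances[: len(strengths)] *= 1.0 + np.asarray(strengths, dtype=float)
    # Rows are data vectors; each column is scaled to its population std.
    X = rng.standard_normal((p, N)) * np.sqrt(variances)
    C_hat = X.T @ X / p  # zero-mean data assumed, so no centering
    return np.linalg.eigvalsh(C_hat)


if __name__ == "__main__":
    alpha = 2.0
    strengths = [2.0, 0.3]  # only A > alpha^(-1/2) ~ 0.71 should separate
    eig = sample_spectrum(N=400, alpha=alpha, strengths=strengths)
    print("largest sample eigenvalue:", eig[-1])
    print("predicted lambda_u(2.0):  ", lambda_u(2.0, alpha))
    print("second largest eigenvalue:", eig[-2])
    print("bulk upper edge:          ", bulk_upper_edge(alpha))
```

With α = 2, the strength A = 2.0 satisfies α > A^(-2) and should produce an isolated eigenvalue near λ_u, while A = 0.3 lies below the transition and should stay inside the bulk. Agreement is only approximate at finite N.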

Similar resources

Limiting Form of the Sample Covariance Eigenspectrum in PCA and Kernel PCA

We derive the limiting form of the eigenvalue spectrum for sample covariance matrices produced from non-isotropic data. For the analysis of standard PCA we study the case where the data has increased variance along a small number of symmetry-breaking directions. The spectrum depends on the strength of the symmetry-breaking signals and on a parameter α which is the ratio of sample size to data d...

Spectra of massive QCD Dirac Operators from Random Matrix Theory: all three chiral symmetry breaking patterns

The microscopic spectral eigenvalue correlations of QCD Dirac operators in the presence of dynamical fermions are calculated within the framework of Random Matrix Theory (RMT). Our approach treats the low–energy correlation functions of all three chiral symmetry breaking patterns (labeled by the Dyson index β = 1, 2 and 4) on the same footing, offering a unifying description of massive QCD Dira...

PCA learning for sparse high-dimensional data

We study the performance of principal component analysis (PCA). In particular, we consider the problem of how many training pattern vectors are required to accurately represent the low-dimensional structure of the data. This problem is of particular relevance now that PCA is commonly applied to extremely high-dimensional (N ~ 5000–30000) real data sets produced from molecular-biology research p...

Combined Unfolded Principal Component Analysis and Artificial Neural Network for Determination of Ibuprofen in Human Serum by Three-Dimensional Excitation–Emission Matrix Fluorescence Spectroscopy

This study describes a simple and rapid approach to monitoring ibuprofen (IBP). Unfolded principal component analysis-artificial neural network (UPCA-ANN) and excitation-emission spectra resulting from spectrofluorimetry were combined to develop a new model for the determination of IBP in human serum samples. Fluorescence landscapes with excitation wavelengths from 235 to 265 nm and emission...

Journal:
  • Physical review. E, Statistical, nonlinear, and soft matter physics

Volume: 69 (2 Pt 2)

Publication date: 2004